Multi-view Positive and Unlabeled Learning
Abstract
Learning with positive and unlabeled instances (PU learning) arises widely in information retrieval applications. Because negative instances are unavailable, most existing PU learning approaches must either identify a reliable set of negative instances from the unlabeled data or estimate probability densities as an intermediate step. However, inaccurate negative-instance identification or poor density estimation may severely degrade the overall performance of the final predictive model. To avoid both problems, we propose a novel PU learning method based on density ratio estimation that neither constructs any set of negative instances nor estimates any intermediate densities. To further boost PU learning performance, we extend the proposed method to a multi-view setting that exploits multiple heterogeneous sources. Extensive experimental studies demonstrate the effectiveness of the proposed methods, especially when positive labeled data are limited.
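The abstract does not spell out the estimator, so the following is only a rough sketch of the general idea of scoring instances by a directly estimated density ratio between positive and unlabeled data. It assumes a generic least-squares (uLSIF-style) estimator with Gaussian kernels as a stand-in; the function names, the toy data, and the parameters `sigma` and `lam` are illustrative choices, not taken from the paper.

```python
import numpy as np

def gaussian_kernel(x, centers, sigma):
    # Kernel matrix K[i, j] = exp(-||x_i - c_j||^2 / (2 * sigma^2)).
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def fit_density_ratio(x_pos, x_unl, sigma=1.0, lam=0.1):
    # Models r(x) ~ p_positive(x) / p_unlabeled(x) as a kernel expansion
    # centered on the positive samples. Neither density is estimated.
    centers = x_pos
    phi_unl = gaussian_kernel(x_unl, centers, sigma)
    phi_pos = gaussian_kernel(x_pos, centers, sigma)
    H = phi_unl.T @ phi_unl / len(x_unl)   # second moment under unlabeled data
    h = phi_pos.mean(axis=0)               # first moment under positive data
    # Ridge-regularized least-squares solution, clipped to keep the ratio >= 0.
    alpha = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    alpha = np.maximum(alpha, 0.0)
    return lambda x: gaussian_kernel(x, centers, sigma) @ alpha

# Hypothetical 1-D toy data: positives lie in [1, 3]; the unlabeled set
# mixes that region with a second region in [-3, -1].
x_pos = np.linspace(1.0, 3.0, 20)[:, None]
x_unl = np.concatenate([np.linspace(1.0, 3.0, 40),
                        np.linspace(-3.0, -1.0, 40)])[:, None]

ratio = fit_density_ratio(x_pos, x_unl)
scores = ratio(np.array([[2.0], [-2.0]]))
print(scores)  # the point near the positives should receive the higher score
```

A larger ratio suggests an unlabeled instance looks more like the positive sample, so thresholding the score gives a simple classifier. How the threshold is chosen, and how several views are combined, is left open here because the abstract does not describe it.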
Similar resources
Robust Multi-View Boosting with Priors
Many learning tasks in computer vision can be described by multiple views or multiple features. These views can be exploited to learn from unlabeled data, an approach known as “multi-view learning”. In these methods, the classifiers usually label a subset of the unlabeled data for each other iteratively and ignore the rest. In this work, we propose a new multi-view boosting algorithm that, unl...
Unified subspace learning for incomplete and unlabeled multi-view data
Multi-view data, in which each view corresponds to one type of feature set, are common in the real world. Previous multi-view learning methods usually assume complete views. However, multi-view data are often incomplete, meaning that some samples have incomplete feature sets. In addition, most data are unlabeled because manual annotation is costly, which makes learning from such data a challenging problem. In ...
An Information Theoretic Framework for Multi-view Learning
In the multi-view learning paradigm, the input variable is partitioned into two different views X1 and X2 and there is a target variable Y of interest. The underlying assumption is that either view alone is sufficient to predict the target Y accurately. This provides a natural semi-supervised learning setting in which unlabeled data can be used to eliminate hypotheses from either view, whose pr...
Unlabeled Data and Multiple Views
In many real-world applications, unlabeled data are abundant but the number of labeled training examples is often limited, since labeling the data requires extensive human effort and expertise. Thus, exploiting unlabeled data to improve learning performance has attracted significant attention. Major techniques for this purpose include semi-supervised learning and active l...
Multi-View Co-Training of Transliteration Model
This paper discusses a new approach to training a transliteration model from unlabeled data for transliteration extraction. We start with an inquiry into the formulation of the transliteration model by considering different transliteration strategies as a multi-view problem, where each view exploits a natural division of transliteration features, such as phoneme-based, grapheme-based or hybrid feat...